AD*A11« 764  FEDERAL AVIATION ADMINISTRATION WASHINGTON DC OFFICE --ETC  F/G 5/9

A GENERIC MODEL FOR EVALUATION OF THE FEDERAL AVIATION ADMINIST--ETC(U)
MAR 82  J O BOONE

UNCLASSIFIED  FAA-AM-82-2  NL




Technical Report Documentation Page

2.  Government Accession No.

3.  Recipient's Catalog No.

4.  Title and Subtitle

A GENERIC MODEL FOR EVALUATION OF THE FEDERAL
AVIATION ADMINISTRATION AIR TRAFFIC CONTROL SPECIALIST
TRAINING PROGRAMS

5.  Report Date

6.  Performing Organization Code

7.  Author(s)

James O. Boone

8.  Performing Organization Report No.

9.  Performing Organization Name and Address

FAA Civil Aeromedical Institute
P.O. Box 25082
Oklahoma City, Oklahoma 73125

10.  Work Unit No. (TRAIS)

11.  Contract or Grant No.

12.  Sponsoring Agency Name and Address

Office of Aviation Medicine
Federal Aviation Administration
800 Independence Avenue, S.W.
Washington, D.C. 20591

13.  Type of Report and Period Covered

14.  Sponsoring Agency Code

15.  Supplementary Notes

Work was performed under Tasks AM-C-80/81-PSY-87.

16.  Abstract

The Systems Analysis Research Unit at the Civil Aeromedical Institute (CAMI) has
developed a generic model for Federal Aviation Administration (FAA) Academy training
program evaluation. The model will serve as a basis for integrating the total data
base into a common format across all training programs. The model consists of four
components: (1) design, (2) implementation, (3) formative, and (4) summative
evaluation. Design evaluation is an assessment of the comprehensive implementation
plan; implementation evaluation is a determination that the plan is completely and
accurately implemented according to prescription; formative evaluation is a continual
monitoring of the program to keep the process reliable, stable, and on track; and
summative evaluation monitors the product of the training program. The design
evaluation relies on the task, knowledge, and skills analysis and the documents in
the implementation plan. The implementation evaluation makes use of the data from
frequent status studies. Formative and summative evaluations make use of statistics
and mathematical modeling, primarily linear regression models, to monitor the process
and products of the programs and to estimate and [...]ne the impact of changes made
to the programs.


17.  Key Words

Program Evaluation
Math Modeling
Training
Air Traffic Control

18.  Distribution Statement

Document is available to the public
through the National Technical Information
Service, Springfield, Virginia 22161.

19.  Security Classif. (of this report)

Unclassified

20.  Security Classif. (of this page)

Unclassified

21.  No. of Pages

29

Form DOT F 1700.7 (8-72)

Reproduction of completed page authorized


ACKNOWLEDGEMENTS 


Acknowledgements are given to Leland Page for providing Figures 2, 3,
4, 5, and 6, to Allan VanDeventer and Linda Ritchie for providing the report
formats in Appendices A and B, and to Jo Ann Steen for the preparation of
this manuscript.




A  GENERIC  MODEL  FOR  EVALUATION  OF  THE  FEDERAL  AVIATION 
ADMINISTRATION  AIR  TRAFFIC  CONTROL  SPECIALIST  TRAINING  PROGRAMS 


I.  Introduction. 


In a large training institution such as the Federal Aviation
Administration (FAA) Academy, several independent training programs operate
simultaneously. As new technology becomes available for training,
especially in the computer field, new training methods are frequently
implemented. The new simulation facility for radar Air Traffic Control
Specialist (ATCS) training and the new PLATO computer-based instruction
system are examples of these advances. It is redundant and incoherent to
develop a new program evaluation for each new development in ATCS training
methods. Consequently, the Systems Analysis Research Unit at the Civil
Aeromedical Institute (CAMI) has developed a generic program evaluation
model for Academy training programs. While the ATCS training programs were
the primary aim of the model, it is also appropriate for Airway Facility or
Flight Standards training programs. The generic model allows research at
CAMI on Academy programs to be integrated into our total systems approach by
making specific application of the generic model to any new Academy
development. By consistent application of the generic model, the data
collected on programs will be compatible with our continuing data base and
offer a means of expanding our total picture of Academy training programs in
an integrated fashion.


II.  Description  of  the  Program  Evaluation  Model  Components. 

Program evaluation is designed to accomplish several tasks. These
tasks are to (i) define exactly what the program is, its purposes and goals,
(ii) document the exact structure of the program, (iii) define the process
in the program (a logical step-by-step explanation) that achieves the goals,
(iv) monitor the process to ensure that any breakdown in the program during
implementation or operation can be identified, (v) measure the outcomes of
the program to determine if it is accomplishing its goals, and (vi) define
and document any program revisions made to change the process, including the
basis for the change and how this alters the structure and paths to produce
the desired results. This paper describes a generic model for ATCS training
program evaluation. The four components of the model are (i) design
evaluation, (ii) implementation evaluation, (iii) formative evaluation, and
(iv) summative evaluation (3).

Program design and implementation evaluations, as the terms imply, occur
at the beginning of the program. Formative and summative evaluations occur
simultaneously and serve to evaluate the process and course of the program
as well as its products. Each of these evaluation components uses the
techniques of statistics, math modeling, and various reporting systems.

Design Evaluation. Program design evaluation involves ensuring the
proper development of several tasks that make up the program implementation
plan. First, the overall objectives of the program must be clearly defined.
Every expected outcome of the program should be listed. The outcomes should
be organized by broad categories and related to the objectives of the
program. All curricula objectives, student assessment techniques and
instruments, and teaching/training lesson plans must be based firmly on
thorough task, knowledge, and skills analyses. A task analysis is a careful
documentation of all the tasks performed in controlling air traffic and
their relative importance and interaction. A knowledge and skills analysis
is a determination of the knowledge and skills, and the levels of each,
required to perform each task. Consequently, the task, knowledge,
and skills analyses serve as the precise and clear job sample on which the
student curricula, assessment, and teaching/training lesson plans are based.
This is a crucial step.

Next, the teaching/training and assessment process or methods must be
operationally defined. This involves a logically connected step-by-step
explanation of the methods to be employed in accomplishing each of the
outcomes and measuring the accomplishment of each of the objectives. This
should include the use of any teaching equipment or aids. Flowcharts, PERT,
tables, Gantt charts, and graphs should be used as appropriate in defining
the process. Careful documentation of every step should be made during this
evaluation phase by the evaluation staff, with regular reports to the
responsible supervisor on the progress of the design. The completed
implementation plan should be clear enough that any competent educational
expert could carry out the design. Figure 1 illustrates the process of
specifying the program design requirements.


FIGURE 1.  SPECIFICATION OF PROGRAM DESIGN REQUIREMENTS.

In the case of automated ATCS training systems, the design phase has
several additional components. First, the operational requirements (Figure
1) from the task, knowledge, and skills analysis are stated in terms of the
functional products that a training system must produce. This is a clear
description of the visible workings/outcomes of the needed training system.
The functional requirements should contain only the essentials necessary to
simulate the required operational activity for the determined level of
training. Figure 2 describes this step.

Which details of the Real Operational
ATC environment MUST be Simulated?

•  Specify Essentials

•  Eliminate "Desirements"

•  "Freeze" for Duration

•  Program Manager has Decision
   on Future Changes

FIGURE 2.  FUNCTIONAL REQUIREMENTS.

It is at this stage that computer-derived measures to assess student
performance are stated as a functional requirement. Particular care must be
taken in this phase to eliminate any unnecessary requirements. As pointed
out by Page (2), the optimal, cost-efficient point on the complexity
function is the minimal system required to satisfy the needed functional
activity (see Figure 3).


[Plot: Cost to Develop (vertical axis) versus System Complexity (horizontal axis).]

FIGURE 3.  THE COMPLEXITY TRADEOFF.

The next design phase concerns the engineer more than the educational
technologist; however, the educational technologist is involved in this
stage and should be aware of the process. This phase is the design
approach. As Figure 4 points out, this step includes the selection of the
most reliable and cost-efficient minimal system architecture. The
educational technologist acts as a consultant to the system engineer to
ensure that the selected system performance will satisfy the functional
requirements.

Competitive Design:

•  Select System Architecture

•  Types and Size of Computers

•  Software Approach, Language,
   Architecture

•  Determines Growth Potential,
   Flexibility, Maintainability,
   Reliability, Etc.

FIGURE 4.  THE DESIGN APPROACH.

Page (2) points out several reasons why it is very important to make
correct judgments about the system during the design approach phase:

(i)  The developer has to make the corrections;

(ii)  The impact on program cost is less;

(iii)  The cost to the user after system delivery (maintenance)
is much less; and

(iv)  The system will be less troublesome to use early in the
operational phase.

Figure 5 further illustrates the impact on cost of making errors that
must  be  corrected  later  in  the  development  process.  The  two  lines  on  the 
graph  show  the  relative  cost  for  making  a  large  number  of  errors  versus  many 
fewer  errors. 


[Graph: two lines of relative cost plotted across the development stages
Preliminary Design, Detailed Design, Code + Debug, Integrate, Validate, and
Operation.]

FIGURE 5.  ERROR VERSUS COST.

The remainder of the development process during the design evaluation
consists of the detailed design, hardware and software development, and
system testing. The detailed design and hardware and software development
are engineering tasks; however, the educational technologist again acts as
a consultant to ensure that the product satisfies the needs of the training
requirements. Figure 6 depicts the entire process. The system testing
phase is particularly important to the educational technologist, since this
is a demonstration of the system's ability to perform the functional
requirements as specified. Care should be taken to ensure that the test is
a valid demonstration, covering all aspects of the functional requirements
with the stated system reliability. Corrections after the system is in
place and intact can be very costly. The system test should serve as the final
checkpoint to catch all remaining bugs in the system and incongruencies with
the functional specifications.

FIGURE 6.  THE DEVELOPMENT PROCESS.

Implementation Evaluation. The implementation evaluation phase
monitors program implementation and ensures and documents that the program
was implemented strictly according to the design. Any changes made to the
design during implementation should be carefully documented and the design
revised. The implementation evaluation stage ensures that the stated
process is operational, intact, and stable. This evaluation is generally
accomplished by means of frequent status studies during the implementation
stage. Data are collected (usually by surveying the responsible personnel)
on each aspect of the process, and a determination is made about the state of
implementation. Direct observations should also be made on a periodic
schedule. The status studies are generally made into a report for
decision-makers with suggestions to improve or expedite implementation.
Shortcomings in implementation are noted in each report. Figure 7 is a
flowchart depicting the process of implementation evaluation.


FIGURE  7.  FLOWCHART  OF  IMPLEMENTATION  EVALUATION  PHASE. 


Formative Evaluation. When the program is determined to be
operational, intact, and sufficiently stable, formative and summative
evaluations begin. Formative evaluation is an ongoing process that ensures
that the program remains on target. It is the process of continually
collecting data and statistics related to training criteria, i.e., how well
students are doing in training. This is a monitoring process to gauge the
operational stability of the program and the quality of students coming into
the program. It is also a method for monitoring compliance with Equal
Employment Opportunity Commission (EEOC) guidelines.

The data base for formative evaluation should be extensive. It should
contain information for each individual on the current EEOC and Office of
Personnel Management (OPM) minority status code, all pertinent attitude
information such as expectation and the set/information given to them prior
to coming to the Academy, individual and composite scores for selection
tests, other information used for points in selection such as education,
experience, and veteran's preference, pass/fail information, and all
training scores for academic and lab phases. Item responses for all tests
during the training phase should also be maintained.

On a periodic basis, statistics and reports should be summarized for
research purposes and for transmittal to decision-makers. Statistics should
include sample size, means, standard deviations, intercorrelations,
pass/fail rates, reliabilities on tests and labs, tests for parallelism on
different forms of the same measure, and item parameters, i.e., item
difficulty, item discrimination, and the validation of parallel laboratory
problems and new items for parallel tests. These statistics should be
maintained on record in both computer backup files and hard copy. Further,
the statistics should be calculated by input and be cumulative up to and
including the most recent input. Administrative formative evaluation
reports should include sample size, means, and intercorrelations on all
relevant measures, and pass/fail rates stratified by minority status, sex,
prior experience, predevelopmental/noncompetitive entry, veteran's
preference, educational level, option, and region. Appendix A contains
sample reports for formative evaluation.
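
The item parameters named above can be illustrated with a minimal sketch. The report does not say how item difficulty or item discrimination are computed, so the definitions here are assumptions: difficulty is taken as the proportion of students answering the item correctly, and discrimination as the upper-lower index (proportion correct in the top-scoring half minus proportion correct in the bottom-scoring half). The response matrix and function names are hypothetical.

```python
# Hypothetical item responses: one row per student, one column per item
# (1 = correct, 0 = incorrect). These are illustrative values, not Academy data.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]


def item_difficulty(rows, item):
    """Proportion of students answering the item correctly (assumed definition)."""
    return sum(row[item] for row in rows) / len(rows)


def discrimination_index(rows, item):
    """Upper-lower index: p(correct | top half) - p(correct | bottom half).

    Students are ranked by total score; this is one common definition and
    is an assumption, since the report does not specify a formula.
    """
    ranked = sorted(rows, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(row[item] for row in upper) / half
    p_lower = sum(row[item] for row in lower) / half
    return p_upper - p_lower


difficulties = [item_difficulty(responses, i) for i in range(3)]
discriminations = [discrimination_index(responses, i) for i in range(3)]
```

An item that every student answers correctly has a difficulty of 1.0 and cannot discriminate between stronger and weaker students, which is why the two parameters are usually reported together.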

When, based on the formative summary data, there appears to be a
problem in how the training program is running, the evaluator has the
responsibility to alert the appropriate administrative personnel and prepare
a concise report identifying the problem areas. Isolating the exact area of
concern may require some mathematical modeling. The attitude information,
where appropriate, should be employed as a covariate in the modeling.
Modeling will be discussed in detail later.

Summative Evaluation. Summative evaluation is a continual assessment
of the quality of the products of the program. While formative evaluation
is summarized on an input-by-input basis and serves as an immediate feedback
loop for ongoing program revisions if needed, summative evaluation occurs on
a larger scale across a longer time span (e.g., on a yearly basis).

Formative evaluation is concerned with internal program accuracy and
stability, program reliability, and content and/or concurrent demonstrations
of validity. (For example, are the measures reliable? Are the objectives
well matched with curricula content? Do the pass/fail rates remain
stable?) Summative evaluation, however, is a check on the quality of the
output from the stabilized program. The summative evaluation is a test of
predictive or criterion validity. It is a measure of the on-the-job success
of those who pass the Academy training, and of the relationship between how
well the candidates performed in the Academy and how well they performed
on the job. The so-called validity coefficient is the measure of this
relationship.
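
As a minimal sketch of the validity coefficient, the relationship between Academy performance and on-the-job performance can be expressed as a Pearson product-moment correlation. The choice of Pearson's r is an assumption (the report does not name the statistic at this point), and the scores and the function name below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: Academy composite scores and a later job-performance
# rating for the same five graduates (illustrative values only).
academy_scores = [70, 75, 80, 85, 90]
job_ratings = [3, 2, 4, 4, 5]

validity_coefficient = pearson_r(academy_scores, job_ratings)
```

With these illustrative values the coefficient is about .83; a value near zero would indicate that Academy performance says little about later job success.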

The summative data base should consist of several components. It is a
comprehensive tracking of the career progression of every successful Academy
candidate. It should contain data for every individual on types of
facilities where the person has been employed, measures of job performance
at each of these sites (criterion measures), type of attrition and why,
whether a person changed options and why, whether a person was maintained by
the agency in a non-2152 (ATCS) position, and as much attitude and
demographic information as possible (e.g., divorce, aspects of the job the
person likes or dislikes, etc.).

FIGURE 8.  FLOWCHART OF THE GENERAL PROCESS FOR BOTH FORMATIVE
AND SUMMATIVE EVALUATION.

Statistics and reports should be summarized from the summative data
base on a regular schedule for research and as information for
decision-making. Statistics should include sample sizes, means, standard
deviations, intercorrelations, validity coefficients, attrition rates, and
mathematical modeling, using the attitude information as a covariate.
Administrative summative reports should include sample size, means,
intercorrelations, and validity coefficients on all relevant data, and
attrition data should be stratified by minority status, sex, prior
experience, predevelopmental/noncompetitive entry, veteran's preference,
educational level, reasons for attrition, 2152/non-2152 attrition, option,
and region.
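
The stratified attrition rates described above can be sketched as a simple grouped proportion. The record layout, field names, and values below are hypothetical and stand in for whatever the summative data base actually holds; the same function could stratify by any of the listed variables.

```python
from collections import defaultdict


def attrition_rates(records, key):
    """Return {stratum: attrition rate} for the given stratifying field."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [attrited, total]
    for record in records:
        stratum = record[key]
        counts[stratum][1] += 1
        if record["attrited"]:
            counts[stratum][0] += 1
    return {stratum: lost / total for stratum, (lost, total) in counts.items()}


# Hypothetical graduate records (illustrative values, not agency data).
records = [
    {"option": "terminal", "attrited": False},
    {"option": "terminal", "attrited": True},
    {"option": "terminal", "attrited": False},
    {"option": "terminal", "attrited": False},
    {"option": "en route", "attrited": True},
    {"option": "en route", "attrited": False},
]

rates_by_option = attrition_rates(records, "option")
```

Reporting the stratum sample sizes alongside each rate, as the text recommends, guards against over-reading a rate based on very few graduates.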

If the summative data base demonstrates a problem in the program, a
need for a major program revision may be indicated. The data should be
reviewed very carefully, employing mathematical modeling to isolate the
source of the problem. As in the formative evaluation, the decision-makers
should be alerted to the problem but, in addition, in the case of the
summative data, policymakers and Academy officials should be alerted. Major
program revisions require careful planning and more detailed attention than
revisions based on formative data. Appendix B contains an example of the
summative reports. Figure 8 flowcharts the general process for both
formative and summative evaluation.

The interaction and dynamic nature of the program evaluation
components. Figure 9 contains a summary of the four components of the ATCS
program evaluation model. The descriptions in Figure 9 imply an interaction
between the formative and summative evaluations. The formative evaluation
is designed, through constant analyses and feedback mechanisms, to serve as
a guidance system in keeping the program on track toward meeting the stated
curricular objectives. It serves to stabilize the methods employed in
teaching and training the curricular objectives. The summative evaluation
is designed to inform policymakers as to whether the methods employed in
meeting the curricular objectives, and/or teaching to these stated
objectives, actually produce a successful ATCS. If the training methods are
not stable or the curricular training objectives are not met, and/or these
shortcomings are not detected and corrected within a very short time period
by the formative evaluation, it is impossible for the summative evaluation
to determine whether the present training methods being employed and/or the
present curricular objectives are producing the product being viewed. The
interaction between formative and summative evaluation is depicted in Figure
10.


This interaction between summative and formative evaluation has several
implications: (i) It implies that a program should be very carefully
designed and implemented initially. As previously mentioned, this means
performing thorough task, knowledge, and skills analyses and a careful
matching between the job samples taken from the analyses and the curricular
objectives, assessment techniques, and training methods. (ii) The
interaction also implies that the summative evaluation assesses how
successfully the formative evaluation is working. An unstable program
produces confusing and inconsistent summative data. (iii) The last
implication relates to program changes: when can program revisions be made,
how large a change can be made based on formative and summative data, and
what type of evaluation is required given that a change is made.




DESIGN EVALUATION

  OBJECTIVE:  To state goals; develop, define, and document curricula,
  objectives, modules, and assessment tools. Develop and define a logically
  connected step-by-step process to achieve goals. Automated training
  systems require monitoring the system development.

  METHOD:  Careful documentation and description of major systems and
  subsystems of the program; use of flowcharts, PERT, tables, graphs, and
  general systems analysis technology.

  RELATION TO DECISION-MAKING:  Determining the best plan of action to
  accomplish the program objectives.

IMPLEMENTATION EVALUATION

  OBJECTIVE:  To ensure that the design was fully and correctly implemented
  and is intact and stable.

  METHOD:  Frequent status studies with data indicating the extent of
  implementation for each area of the process. Regular status reports are
  issued.

  RELATION TO DECISION-MAKING:  Status reports offer suggestions for
  improvement and indicate where implementation is falling short.

FORMATIVE EVALUATION

  OBJECTIVE:  To ensure that the program stays on target and to add and
  evaluate refinements or changes to the program in a systematic manner.
  More concerned with program reliability.

  METHOD:  Maintenance of an ongoing data base that measures program
  stability and collection of data that would be sensitive to any change
  introduced. Regular reports to management.

  RELATION TO DECISION-MAKING:  Offers information on program stability
  through regular reports and information on the effects of any program
  change. Also offers information on EEOC compliance.

SUMMATIVE EVALUATION

  OBJECTIVE:  To measure the quality of the final program product. What is
  the program payoff? More concerned with program validity.

  METHOD:  Collection and analysis of data on the success of graduates of
  the program. Periodic reports on the data analysis.

  RELATION TO DECISION-MAKING:  Offers information on the quality of the
  program product that is especially useful for long-term planning and
  needed change.

FIGURE 9.  SUMMARY OF THE FOUR COMPONENTS OF THE ATCS PROGRAM
EVALUATION MODEL.


SUMMATIVE  EVALUATION  KEEPS  PROGRAM  ON 
TRACK  TOWARD  MEETING  PROGRAM  OBJECTIVES. 


FIGURE  10.  INTERACTION  BETWEEN  FORMATIVE  AND  SUMMATIVE  EVALUATION. 

Program  changes  can  be  classified  as  (i)  program  adjustments,  (ii) 
changing  a  program  component,  (iii)  adding  or  subtracting  a  program 
component,  and  (iv)  a  major  program  restructuring. 

Program adjustments are changes that affect a common element across
several program components. They are small or medium changes. Large
program adjustments would fall under the category of major restructuring.
Program adjustments usually take the form of changes in presentation of
lesson material, small curricula adjustments, modifications in the types of
assessment devices, or changing the item format in tests. Generally, the
formative evaluation process can offer sufficient information to evaluate
such a change. In a small number of cases a medium program adjustment may
require summative data to evaluate the change.

Changing a program component can vary from a small to a large change.
Small changes would include changing items in a component test, reordering
the sequence of test items, or minor modifications in a program component
curriculum. Small program component changes can be sufficiently evaluated
by the formative evaluation data. Medium component changes would include
changing the sequence of the component in the program or a medium curriculum
change in the component. Medium changes require formative evaluation data
and usually also require summative evaluation data. A large change is a
major revamping of the component and requires design, implementation,
formative, and summative evaluation.

Adding or subtracting a program component, even in a conservative
sense, represents a medium or (usually) a large change. In either case,
the data required to evaluate the effect of adding or subtracting a component
include design, implementation, formative, and summative evaluation.

A major restructuring of the program essentially requires the same
process as a beginning program and involves design, implementation,
formative, and summative evaluation. Evaluation of a major restructuring
should place more emphasis on the design and implementation evaluation than
any of the other types of change. Program changes and the required types of
evaluation are summarized in Figure 11.


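+----------------+-----------+----------------+----------------+
| ACTIVITY       | SMALL     | MEDIUM         | LARGE          |
+----------------+-----------+----------------+----------------+
| Program        | Formative | Formative      | N/A            |
| Adjustments    |           | Summative      |                |
+----------------+-----------+----------------+----------------+
| Change a       | Formative | Formative      | Design         |
| Program        |           | Summative      | Implementation |
| Component      |           |                | Formative      |
|                |           |                | Summative      |
+----------------+-----------+----------------+----------------+
| Add/Subtract   | N/A       | Design         | Design         |
| a Program      |           | Implementation | Implementation |
| Component      |           | Formative      | Formative      |
|                |           | Summative      | Summative      |
+----------------+-----------+----------------+----------------+
| Major          | N/A       | N/A            | Design         |
| Program        |           |                | Implementation |
| Restructuring  |           |                | Formative      |
|                |           |                | Summative      |
+----------------+-----------+----------------+----------------+

FIGURE 11.  TYPE OF EVALUATION REQUIRED FOR PROGRAM CHANGES.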
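
The decision rule summarized in Figure 11 can be sketched as a lookup table keyed by the type and size of the change. This is only an illustration of the figure's content; the key strings and the function name are hypothetical, and combinations the figure marks N/A are treated here as invalid input.

```python
ALL_FOUR = ("design", "implementation", "formative", "summative")

# (type of change, size of change) -> evaluations required, per Figure 11.
# None marks the combinations that Figure 11 lists as N/A.
REQUIRED_EVALUATIONS = {
    ("adjustment", "small"): ("formative",),
    # Figure 11 lists both; the text notes summative data are needed only
    # in a small number of medium adjustments.
    ("adjustment", "medium"): ("formative", "summative"),
    ("adjustment", "large"): None,
    ("change component", "small"): ("formative",),
    ("change component", "medium"): ("formative", "summative"),
    ("change component", "large"): ALL_FOUR,
    ("add/subtract component", "small"): None,
    ("add/subtract component", "medium"): ALL_FOUR,
    ("add/subtract component", "large"): ALL_FOUR,
    ("major restructuring", "small"): None,
    ("major restructuring", "medium"): None,
    ("major restructuring", "large"): ALL_FOUR,
}


def required_evaluations(change_type, size):
    """Return the evaluation components Figure 11 requires for a change."""
    evaluations = REQUIRED_EVALUATIONS[(change_type, size)]
    if evaluations is None:
        raise ValueError(f"Figure 11 lists {size} {change_type} as N/A")
    return evaluations


needed = required_evaluations("change component", "medium")
```

Keeping the rule in one table makes it easy to check a proposed program change against the figure before deciding what data must be collected.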

The operation of the total model is dynamic and interactive. Each
component is dependent on the correct accomplishment of the other
components. The dependency, while overlapping, is somewhat linear.
Accurate summative evaluation depends on accurate formative evaluation,
accurate program implementation, and accurate design. Accurate formative
evaluation depends on accurate implementation and design, and so forth back
to the task, knowledge, and skills analysis used in the design evaluation.
Figure 12 is a schematic path diagram of the interaction and dependency
among the four components.


FIGURE  12.  THE  INTERACTIVE  CLOSED  LOOP  PROGRAM  EVALUATION  STRUCTURE. 




III.  Mathematical Models in Formative and Summative Evaluation.

The linear model and intervening variables. The mathematical models
used in formative and summative evaluation center on principles of linear
regression. The most common phenomenon of interest is how each measure is
related to another, i.e., the regression of one measure on another. The
simple equation for linear regression is:

    Y = a + bX,    (1)

where X = the independent measures, a = the value where the regression line
intercepts the Y axis, b = the regression coefficient, and Y = the predicted
dependent scores. Two sets of measures (X and Y) can be plotted as in
Figure 13. As viewed in the plot, the regression line intercepts the Y axis
at 1 and the slope of the regression line = .90. The slope indicates the
predicted change in Y for each unit change in X. Consequently, it is easily
seen how linear regression offers a means to predict or estimate values of
one measure from values of another measure. The closer the data points on
the graph are to a straight line, the better the prediction. A good example
is plotting Academy scores (X) by a success measure on the job (Y).

    Y = 1.0 + .90X

    for X = 3,  Y = 1.0 + (.90)3
                Y = 3.7

    r = .90

FIGURE 13.  PLOT OF TWO SETS OF MEASURES.
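
The prediction worked in Figure 13 can be reproduced with a short ordinary least-squares sketch. The data points below are hypothetical and were placed exactly on the line Y = 1.0 + .90X so that the fitted intercept and slope match the figure; real scores would scatter about the line, as the figure's r = .90 implies. The function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # regression coefficient (slope)
    a = mean_y - b * mean_x  # intercept on the Y axis
    return a, b


def predict(a, b, x):
    """Predicted dependent score from equation (1)."""
    return a + b * x


# Hypothetical measures lying exactly on Y = 1.0 + .90X (assumption).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.9, 2.8, 3.7, 4.6, 5.5]

a, b = fit_line(xs, ys)
y_at_3 = predict(a, b, 3.0)  # approximately 3.7, as in Figure 13
```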

To explicate, suppose a researcher were to take measures of the same
phenomena two times and then plot the two occasions. Identical measurement
processes and conditions would yield a graph like Figure 14. The slope
would be 45 degrees, placing the intercept through the origin at $a = 0$ and
the slope, the increment in Y for a unit change in X, at $b = 1.0$. This
result, as noted above, is contingent on two factors in the process being
identical: (i) perfectly accurate measures on both occasions, and (ii)
identical relevant conditions. If either of these factors were altered, the
regression slope and/or intercept most probably would change.




FIGURE  14.  PLOT  OF  TWO  IDENTICAL  MEASURES. 

The factors that would alter the regression line in the above example
are referred to as: (i) measurement error and (ii) intervening variables.
Figure 15 illustrates the concept.


MEASURE #1 --> INTERVENING VARIABLES AND MEASUREMENT ERROR --> MEASURE #2

FIGURE  15.  THE  CONCEPT  OF  INTERVENING  VARIABLES. 

Measurement error is usually assumed to be symmetrically distributed about
the true value and consequently, when summed, equals zero and has no effect
on the analyses.

One of the major uses of the linear model in program evaluation is the
identification of intervening variables and determining their impact on
dependent variables. For example, what intervening variables affect the
relationship between Academy scores and field success (the validity
coefficient)? In experimental design the intervening variables are
generally viewed as independent variables and the measures affected as
dependent variables. Suppose one wanted to determine the impact of
motivation (independent variable) on Academy success (dependent variable).
Several methods are employed to identify intervening variables and their
impact on dependent measures.




Methods for identifying and determining the impact of intervening
variables. If one can identify intervening variables and their impact on
dependent measures, then a transformation equation can be generated to map X
on to Y and the relationship between X and Y can be mathematically explained
and quantified. The methods employed to accomplish this depend on the
nature of the variables involved and the assumptions one employs in the
model.


Assuming the linear model, the process can be explained as follows.
Suppose, for example, we have two measures of the same variable, and a plot
of the measures appears as in Figure 16.


FIGURE 16. PLOT OF MEASURE #1, MEASURE #2, AND REGRESSION LINES.

Further, suppose we suspect that an intervening variable Z was the
reason for the difference between the two regression lines. It would then
stand to reason that if the measures ($Y_i$) were adjusted to account for the
influence of Z then the regression lines should be equal. Returning to the
linear equation (1),

$$ Y' = a + bX, \tag{1} $$

we can see that the estimated Y, ($Y'$), contains some error of estimation
unless all the X and Y points lie on a straight line. So,

$$ Y_i - Y'_i = Y(E)_i. \tag{2} $$

This is the error made in estimating Y for the element i. We can now state
an equation for $Y_i$ using the linear model,

$$ Y_i = a + bX_i + Y(E)_i. \tag{3} $$

From equations (2) and (3) it is also easily seen that $Y(E)$ is uncorrelated
with X since $Y(E)$ is that component in Y which is unexplained by X.
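
The claim that Y(E) is uncorrelated with X can be checked numerically. The
sketch below, which assumes an ordinary least-squares fit and uses made-up
scores, computes the errors of estimation from equation (2) and their
correlation with X. The helper names fit_line and pearson_r are hypothetical.

```python
def fit_line(xs, ys):
    """Return least-squares intercept a and slope b for Y' = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def pearson_r(us, vs):
    """Product-moment correlation between two equal-length lists."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    return suv / (suu * svv) ** 0.5


# Hypothetical scores that do not fall on a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.4, 4.2, 4.1, 5.9, 6.3]

a, b = fit_line(xs, ys)
# Equation (2): Y(E)_i = Y_i - Y'_i
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
r_xe = pearson_r(xs, residuals)
print(abs(r_xe) < 1e-9)  # zero apart from floating-point rounding
```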


The aim of the adjustment of the measures ($Y_i$) is to remove the
influence (or the relationship) of Z on the measures (Y). Thus applying the
linear equation to estimate Y from Z,

$$ Y' = A + bZ, \tag{4} $$

we can produce

$$ Y_i - Y'_i = Y(E)_i. \tag{5} $$

$Y(E)_i$ is now a Y value minus the influence of Z. If we plot the $Y(E)_i$
values separately for M#1 and M#2 as in Figure 16, we have a picture of the
plot without the influence of Z. If the regression lines for M#1 and M#2
are identical after removing the influence of Z, then Z can be used to
explain the difference in the regression lines for M#1 and M#2 and Z can be
used to adjust Y to equate the M#1 and M#2 regression lines. This procedure
obviously assumes parallel regression lines since it is the intercept that
is being adjusted. Consequently, the adjustment to Y can be stated in the
linear equation as,

$$ Y(A)_i = Y_i - b(Z_i - a) + Y(E)_i, \tag{6} $$

where $Y(A)_i$ = the adjusted Y for element i. Or, the estimated adjusted
Y is,

$$ Y(A)'_i = Y_i - b(Z_i - a). \tag{7} $$
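
The residual step in equations (4) and (5) can be illustrated with a small
sketch. It assumes hypothetical data for two measurement conditions, M#1 and
M#2, in which Y depends on X with the same slope but Z runs higher under
M#2. Y is regressed on Z over the pooled data, the Z-predicted part is
removed, and the regression of the remainder on X is compared across the two
conditions. All names and values are illustrative, and the sketch covers only
equations (4) and (5), not the adjusted score of equations (6) and (7).

```python
def fit_line(xs, ys):
    """Return least-squares intercept and slope for a one-predictor fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data: same X under both conditions, Z higher under M#2.
x = [1.0, 2.0, 3.0, 4.0]
z_m1 = [1.0, 0.0, 0.0, 1.0]
z_m2 = [3.0, 2.0, 2.0, 3.0]
y_m1 = [1.0 + 0.5 * xi + 2.0 * zi for xi, zi in zip(x, z_m1)]
y_m2 = [1.0 + 0.5 * xi + 2.0 * zi for xi, zi in zip(x, z_m2)]

# Equation (4): estimate Y from Z using both conditions pooled.
a_z, b_z = fit_line(z_m1 + z_m2, y_m1 + y_m2)


def remove_z(ys, zs):
    """Equation (5): Y(E)_i = Y_i - Y'_i, the part of Y not explained by Z."""
    return [y - (a_z + b_z * z) for y, z in zip(ys, zs)]


raw_m1 = fit_line(x, y_m1)                   # intercept 2.0, slope 0.5
raw_m2 = fit_line(x, y_m2)                   # intercept 6.0, slope 0.5
adj_m1 = fit_line(x, remove_z(y_m1, z_m1))   # intercept -1.25, slope 0.5
adj_m2 = fit_line(x, remove_z(y_m2, z_m2))   # intercept -1.25, slope 0.5
print(raw_m1, raw_m2, adj_m1, adj_m2)
```

In this constructed case the two raw lines are parallel but have different
intercepts, and they coincide once the influence of Z is removed, which is the
situation the text describes.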

The method just explained for removing the influence of Z is a
univariate process, checking one variable at a time. Often, intervening
variables that account for the differences in regression lines may be
correlated. Consequently, the effects they account for are not additive, as
seen in Figure 17.

Univariate analyses can help select variables in which their total
effects (unique effect + shared effect) are shown; however, if two or more
intervening variables share, to a large degree, in their effect on the Y
regression lines under two separate conditions, M#1 and M#2, then all the
variables are not needed to explain the difference nor to adjust Y. One
must employ an analysis that utilizes the unique contribution of variables
in producing the differences in M#1 and M#2 without the spurious addition of
overlapping effects shared by one or more variables.


FIGURE 17. OVERLAP OF INTERVENING VARIABLES $Z_1$ AND $Z_2$ WITH Y
AND THEMSELVES.

Multiple regression. Multiple linear regression is a method of
analyzing the shared and unique contributions of more than one independent
variable $X_i$ ($i = 1 \dots k$) to the variation of one dependent measure, Y
(Kerlinger, 1973). By variation we mean how the measures in Y are different
from each other. For example, if all Academy candidates were equally
successful on-the-job regardless of their Academy scores or selection test
scores, then the variance in the success measure is zero and we have no need
to analyze the measure.

The model for multiple regression is an extension of our previous
single variable regression of X on Y. For an estimated Y, $Y'$,

$$ Y' = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k, \tag{8} $$

and for Y,

$$ Y = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e. \tag{9} $$


The procedures in multiple regression are an extension of the previous
discussion on adjusting Y for the influence of a third variable, Z, prior to
correlating X and Y. In multiple regression the variance in Y that can be
explained by the first variable is partialed out. The remaining variance in
Y that can be explained by the second variable without duplicating or
overlapping that expressed by the first variable is then partialed out.
This process continues until all the independent variables, $X_i$, have been
considered. The relationship of $X_i$ and Y without duplication or overlap
among $X_i$ is termed the multiple R. The multiple R squared, $R^2$, expresses
the proportion of variation in Y explained by $X_i$. If all $X_i$ were
uncorrelated, not duplicative, then the multiple $R^2$ would be the simple sum
of all the squared correlations, $r^2$ (see Figure 13 to review "r"), of $X_i$
and Y.

$$ R^2_{y.12 \dots k} = r^2_{y1} + r^2_{y2} + \dots + r^2_{yk} \tag{10} $$

However, if the $X_i$ are correlated, then $R^2$ is the sum of all the $r^2$ of
$X_i$ and Y with the duplication and overlap partialed out.

$$ R^2_{y.12 \dots k} = r^2_{y1} + r^2_{y(2.1)} + \dots + r^2_{y(k.12 \dots k-1)}, \tag{11} $$

where $r^2_{y(2.1)}$ is read, the squared correlation of variable 2 and Y with
the effects of variable 1 partialed out.
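
For the two-predictor case, equations (10) and (11) can be compared directly.
The sketch below uses hypothetical scores in which X1 and X2 are correlated.
It obtains r for Y with the part of X2 left over after X1 is partialed out,
sums the squared terms as in equation (11), and cross-checks the result
against the standard textbook formula for R squared with two predictors. The
helper names are illustrative and not taken from the report.

```python
def mean(vals):
    return sum(vals) / len(vals)


def pearson_r(us, vs):
    """Product-moment correlation between two equal-length lists."""
    mu, mv = mean(us), mean(vs)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / (suu * svv) ** 0.5


def residualize(ys, xs):
    """Residuals of ys after a least-squares fit on xs (xs partialed out)."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical scores; x1 and x2 are deliberately correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.5, 2.0, 3.9, 4.1, 6.2, 5.8]

r_y1 = pearson_r(y, x1)
r_y2 = pearson_r(y, x2)
r_12 = pearson_r(x1, x2)

# r_y(2.1): correlation of Y with the part of X2 not shared with X1.
r_y2_1 = pearson_r(y, residualize(x2, x1))

r_squared = r_y1 ** 2 + r_y2_1 ** 2   # equation (11) with k = 2
naive_sum = r_y1 ** 2 + r_y2 ** 2     # equation (10); valid only if r_12 = 0

# Cross-check against the standard two-predictor formula for R squared.
r_squared_check = ((r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12)
                   / (1 - r_12 ** 2))
print(round(r_squared, 4), round(r_squared_check, 4), round(naive_sum, 4))
```

Because the two predictors overlap heavily here, the simple sum of squared
correlations exceeds 1, which shows why the overlap has to be partialed out
before the contributions are added.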

Discriminant analysis. Since the Academy programs are pass/fail, it is
a common question to ask which measures best discriminate between passing
students and failing students. As an example, suppose we had measures on
motivation, level of education, prior experience, and Academy scores and we
wanted to know which of these best discriminated between students who pass
field training to full performance level (FPL) and those who fail. Then we
would perform a discriminant analysis. To explain this procedure, we go to
the most simple case. Suppose we have the simple case of two suspected
intervening variables and two groups. If the data for the two groups were
expressed on a graph where the axes were the two predictor variables $X_1$ and
$X_2$, the data could be shown as coordinates of the two variables.


FIGURE 18. PLOTTED DATA FOR THE TWO VARIABLES, TWO GROUP CASE.

Forming the weighted sum of the intervening variables would create a
new variable, Y.

$$ Y = v_1 X_1 + v_2 X_2, \tag{12} $$

where $v_i$ = the weights employed. This may be recognized as another linear
equation similar to those previously discussed under "multiple regression."
The question is how to express the measure Y on our graph, or, more
accurately, how can a Y axis be indicated in the way $X_1$ and $X_2$ are? The
answer is, the desired axis can be demonstrated by locating the coordinates
represented by the two weights, $v_1$ and $v_2$, and drawing a line from this
point to the origin of the $X_1$ and $X_2$ axes (see Figure 19). The data
coordinates can now be projected onto the new Y axis as separate
distributions for each group.

The following is a representation of the scheme described above for
four different weighted sums of $X_1$ and $X_2$.


FIGURE  19.  PROJECTIONS  OF  TWO  GROUPS  ON  FOUR  AXES  REPRESENTING 
LINEAR  COMBINATIONS  OF  THE  ORIGINAL  VARIABLES. 


It can be noted from the representation that the projected
distributions on the Y axes are separated differently. Some of the
projected distributions overlap more than others. The problem, then, is to
define the Y axis in such a manner that the projected distributions overlap
the least. Obviously, in order to do that, a means to measure the overlap
must be determined.

One means to define the overlap might be to subtract the means of
$Y_1$ and $Y_2$ and divide that difference by the standard deviation of one
of the groups.

$$ \frac{\bar{Y}_1 - \bar{Y}_2}{S_y} \tag{13} $$

However, this would express the difference only in terms of one of the
standard deviations.

A more equitable way to do this is to pool the within groups standard
deviation. This is accomplished in the following manner.

$$ S_{y(w)} = \sqrt{\frac{(n_1 - 1) S^2_{y(1)} + (n_2 - 1) S^2_{y(2)}}{n_1 + n_2 - 2}}, \tag{14} $$

where $S_{y(w)}$ is the pooled within groups standard deviation, $n$ is the
group sample size, and $S^2_y$ is the variance for each group.

Now, a more stable and equitable measure of overlap can be expressed

$$ f^2 = \frac{(\bar{Y}_1 - \bar{Y}_2)^2}{S^2_{y(w)}}, \tag{15} $$

where $f$ is the measure of overlap in the two distributions.
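
A short numerical sketch of equation (15) follows. It assumes that the pooled
within groups variance is the two-group case of equation (19), with n - 1 in
each group variance, and it uses hypothetical projected Y scores for a passing
and a failing group. The function names are illustrative.

```python
def mean(vals):
    return sum(vals) / len(vals)


def sample_variance(vals):
    """Group variance S^2 with an n - 1 denominator (an assumption here)."""
    m = mean(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)


def overlap_f2(y1, y2):
    """Equation (15): squared mean difference over pooled within variance."""
    n1, n2 = len(y1), len(y2)
    pooled_var = (((n1 - 1) * sample_variance(y1)
                   + (n2 - 1) * sample_variance(y2)) / (n1 + n2 - 2))
    return (mean(y1) - mean(y2)) ** 2 / pooled_var


# Hypothetical projected Y scores for the two groups.
passed = [4.0, 5.0, 6.0, 7.0, 8.0]
failed = [1.0, 2.0, 3.0, 4.0, 5.0]

f2 = overlap_f2(passed, failed)
print(round(f2, 2))  # (6 - 3)^2 / 2.5 = 3.6
```

A larger value means the two projected distributions are farther apart relative
to their pooled spread, that is, they overlap less.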
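To extend this measure to more than two groups,

$$ f^2_k = \frac{VAR(\bar{Y})}{S^2_{y(w)}}, \tag{16} $$

where

$$ VAR(\bar{Y}) = \frac{\sum_{g=1}^{k} (\bar{Y}_g - \bar{Y}_{.})^2}{k-1}, \tag{17} $$

with

$$ \bar{Y}_{.} = \frac{\bar{Y}_1 + \bar{Y}_2 + \dots + \bar{Y}_k}{k}, \tag{18} $$

and

$$ S^2_{y(w)} = \frac{\sum_{g=1}^{k} (n_g - 1) S^2_{y(g)}}{N-k}, \tag{19} $$

this being the within groups mean square, $MS_{(w)}$.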

In order to take unequal n into account in the numerator above,

$$ MS_{(b)} = \frac{\sum_{g=1}^{k} n_g (\bar{Y}_{(g)} - \bar{Y}_{.})^2}{k-1}, \tag{20} $$

where

$$ \bar{Y}_{.} = \frac{\sum_{g=1}^{k} n_g \bar{Y}_{(g)}}{N} \tag{21} $$

and $\bar{Y}_{.}$ is the grand mean of Y in the total sample in all k groups.


Collecting things together, we have,

$$ f^2_k = \frac{MS_{(b)}}{MS_{(w)}}, \tag{22} $$

with $MS_{(b)}$ = mean squares between and $MS_{(w)}$ = mean squares within.
This formula is generally expressed as,

$$ \frac{SS_{(b)}/(k-1)}{SS_{(w)}/(N-k)} = \frac{SS_{(b)}}{SS_{(w)}} \cdot \frac{N-k}{k-1}, \tag{23} $$

and, since the multipliers, (N-k)/(k-1), are constant for any given problem,
they can be omitted, yielding

$$ h = \frac{SS_{(b)}}{SS_{(w)}}, \tag{24} $$


where $SS_{(b)}$ = sum of squares between and $SS_{(w)}$ = sum of squares
within,

$$ SS_{(b)} = \sum_{g=1}^{k} n_g (\bar{Y}_{(g)} - \bar{Y}_{.})^2, \tag{25} $$

and

$$ SS_{(w)} = \sum_{g=1}^{k} (n_g - 1) S^2_{y(g)} = \sum_{g=1}^{k} \sum_{i=1}^{n_g} (Y_{(g)i} - \bar{Y}_{(g)})^2, \tag{26} $$

$Y_{(g)i}$ being the Y score of the ith individual in the gth group. The
quantity h is termed the criterion.


The problem, then, in performing this analysis is to express the
criterion, h, as a function of the weights $v_1, v_2, \dots, v_p$, and to
determine by differential calculus the set of weights which maximize h. The
weights then express the relative contribution of each intervening variable
in explaining the differences in the two groups.
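
For the two-variable, two-group case the maximizing weights have a known
closed form: they are proportional to the inverse of the pooled within groups
sums-of-squares-and-products matrix times the difference between the group
mean vectors (Fisher's linear discriminant). The sketch below uses that closed
form rather than carrying out the differentiation, computes the criterion
h = SS(b)/SS(w) for the resulting weights, and compares it with a few arbitrary
weightings. The data and all function names are hypothetical.

```python
def mean(vals):
    return sum(vals) / len(vals)


def criterion(weights, groups):
    """h = SS(b) / SS(w) for the composite Y = v1*X1 + v2*X2."""
    v1, v2 = weights
    scores = [[v1 * x1 + v2 * x2 for x1, x2 in g] for g in groups]
    n_total = sum(len(s) for s in scores)
    grand = sum(sum(s) for s in scores) / n_total
    ss_b = sum(len(s) * (mean(s) - grand) ** 2 for s in scores)
    ss_w = sum((y - mean(s)) ** 2 for s in scores for y in s)
    return ss_b / ss_w


def fisher_weights(group_a, group_b):
    """Weights proportional to W^-1 (mean_a - mean_b) for two variables."""
    w11 = w12 = w22 = 0.0
    means = []
    for g in (group_a, group_b):
        m1 = mean([p[0] for p in g])
        m2 = mean([p[1] for p in g])
        means.append((m1, m2))
        for x1, x2 in g:
            # Accumulate pooled within groups sums of squares and products.
            w11 += (x1 - m1) ** 2
            w12 += (x1 - m1) * (x2 - m2)
            w22 += (x2 - m2) ** 2
    d1 = means[0][0] - means[1][0]
    d2 = means[0][1] - means[1][1]
    det = w11 * w22 - w12 ** 2
    # Explicit inverse of the 2 x 2 matrix W applied to the mean difference.
    return ((w22 * d1 - w12 * d2) / det, (w11 * d2 - w12 * d1) / det)


# Hypothetical (X1, X2) scores for students who pass and who fail.
passed = [(4.0, 2.0), (5.0, 4.0), (6.0, 3.0), (7.0, 5.0)]
failed = [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0)]

best = fisher_weights(passed, failed)
h_best = criterion(best, [passed, failed])
others = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
h_others = [criterion(w, [passed, failed]) for w in others]
print(best, round(h_best, 4), [round(h, 4) for h in h_others])
```

Only the ratio of the weights matters, since multiplying both by a constant
leaves h unchanged. With more than two variables or groups the same
maximization is usually solved as an eigenvalue problem.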

Summary of mathematical models. Basically three examples of linear
models were described, as well as the general notion of linear regression.
The three examples are by no means inclusive of all linear models; however,
the ones presented are the most frequently used in the ATCS program
evaluation model. Linear regression models are particularly useful in
program evaluation since the major function of any screening program is to
best predict on-the-job success and to determine the most efficient subset
of measures that can be used to do that. Linear regression is also very
useful for estimating the impact of various proposed changes to the program.
Without mathematical models, program evaluation would be extremely difficult
at best.


IV.  Summary. 

The Systems Analysis Research Unit at CAMI has developed a generic
model for Academy training program evaluation. The model will serve as a
basis for integrating the total data base into a common format across all
training programs. The model consists of four components: (i) design, (ii)
implementation, (iii) formative, and (iv) summative evaluation. Design
evaluation is an assessment of the comprehensive implementation plan;
implementation evaluation is a determination that the plan is completely and
accurately implemented according to prescription; formative evaluation is a
continual monitoring of the program to keep the process reliable, stable,
and on track; and summative evaluation monitors the product of the training
program. The design evaluation relies on the task, knowledge, and skills
analysis and on the documents in the implementation plan. The
implementation evaluation makes use of data from frequent status studies.
Formative and summative evaluations make use of statistics and mathematical
modeling, primarily linear regression models, to monitor the process and
products of the programs and to estimate and determine the impact of changes
made to the programs.




REFERENCES 


Kerlinger, F. N.: Multiple Regression in Behavioral Research. New
York: Holt, Rinehart, and Winston, 1973.

Page, L. F.: Technology in Air Traffic Control Training and Simulation.
PROCEEDINGS OF THE THIRD INTERNATIONAL LEARNING TECHNOLOGY CONGRESS AND
EXPOSITION, 2:212-216, 1980.

Stufflebeam, D. L., et al.: Educational Evaluation and Decision Making.
Itasca, Ill.: F. E. Peacock Publishers, Inc., 1971.




APPENDIX  A 


Examples  of  Reports  for  Formative  Evaluation 


[The sample report tables are illegible in this copy. The only legible header
reads: CAMI SYSTEMS ANALYSIS RESEARCH UNIT, 9-JUN-81, REPORT NUMBER 80RXE12.]
APPENDIX  B 


EXAMPLE OF A SUMMATIVE REPORT


EN ROUTE TRACKING STUDY


GROUP                                 SIZE    PERCENT

PASS ACADEMY                           824      88.0%
FAIL ACADEMY                            80       8.5%
NO SHOW OR WITHDRAW                     32       3.4%
TOTAL                                  936     100.0%

STILL ACTIVE IN 2152 (OF PASSES)       698      84.7%
NOT ACTIVE IN 2152 (OF PASSES)         126      15.3%
TOTAL PASSES                           824     100.0%

TOTAL STILL ACTIVE                     698      74.6%
TOTAL NOT ACTIVE                       238      25.4%
TOTAL DEVELOPMENTALS                   936     100.0%

FOR THOSE STILL ACTIVE IN 2152 OPTION (N = 698):

LAB PHASE COMPOSITE SCORE                     = 81.7
SUPERVISOR RATING SCORE                       = 5.1
CORRELATION OF LAB COMPOSITE WITH
  SUPERVISOR RATING                           = .212
  CORRECTED FOR RESTRICTION                   = .295


TERMINAL TRACKING STUDY


GROUP                                 SIZE    PERCENT

PASS ACADEMY                           868      89.3%
FAIL ACADEMY                            85       8.7%
NO SHOW OR WITHDRAW                     19       2.0%
TOTAL                                  972     100.0%

STILL ACTIVE IN 2152 (OF PASSES)       789      90.9%
NOT ACTIVE IN 2152 (OF PASSES)          79       9.1%
TOTAL PASSES                           868     100.0%

TOTAL STILL ACTIVE                     789      81.2%
TOTAL NOT ACTIVE                       183      18.8%
TOTAL DEVELOPMENTALS                   972     100.0%

FOR THOSE STILL ACTIVE IN 2152 OPTION (N = 789):

LAB PHASE COMPOSITE SCORE                     = [illegible]
SUPERVISOR RATING SCORE                       = 5.3
CORRELATION OF LAB COMPOSITE WITH
  SUPERVISOR RATING                           = .245
  CORRECTED FOR RESTRICTION                   = .334




